Advanced SQL Statements
SQL processes queries in this logical order:
FROM→ get the data sources.WHERE→ filter individual rows.GROUP BY→ group rows.HAVING→ filter groups.SELECT→ calculate aggregates and select columns.ORDER BY→ sort the result set.LIMIT/OFFSET(orFETCH FIRST N ROWS) → return only a subset of rows.
Aggregate Functions
Aggregate functions in SQL are functions that perform calculations on multiple rows of data and return a single value.
They are often used with the GROUP BY clause to perform calculations on grouped data.
Common Aggregate Functions
| Function | Description |
|---|---|
COUNT() | Counts the number of rows (or non-NULL values if column is specified) |
SUM() | Calculates the total of a numeric column |
AVG() | Calculates the average of a numeric column |
MIN() | Finds the smallest value in a column |
MAX() | Finds the largest value in a column |
General Syntax
SELECT AGGREGATE_FUNCTION(column_name)
FROM table_name
WHERE condition;
- Can be combined with
GROUP BYto apply per group. - Can be combined with
HAVINGto filter groups based on aggregated values.
Example
| order_id | customer_id | product | quantity | price | order_date |
|---|---|---|---|---|---|
| 1 | C001 | Laptop | 1 | 1000 | 2025-08-01 |
| 2 | C002 | Mouse | 2 | 25 | 2025-08-01 |
| 3 | C001 | Keyboard | 1 | 45 | 2025-08-02 |
| 4 | C003 | Laptop | 1 | 950 | 2025-08-02 |
| 5 | C002 | Mouse | 3 | 25 | 2025-08-03 |
Problem: For each customer, how many orders have they placed, and how many total items have they bought?
Solution:
SELECT customer_id,
COUNT(*) AS total_orders,
SUM(quantity) AS total_items
FROM orders
GROUP BY customer_id;
Output
| customer_id | total_orders | total_items |
|---|---|---|
| C001 | 2 | 2 |
| C002 | 2 | 5 |
| C003 | 1 | 1 |
- Aggregate functions always return one value per group.
- You cannot pass multiple columns into a single aggregate directly.
- Use
CASEexpressions inside aggregates for conditional calculations.
Total Revenue per Customer
SELECT customer_id,
SUM(quantity * price) AS total_revenue
FROM orders
GROUP BY customer_id;
Count Products Separately
SELECT customer_id,
SUM(CASE WHEN product = 'Laptop' THEN quantity ELSE 0 END) AS total_laptops,
SUM(CASE WHEN product = 'Mouse' THEN quantity ELSE 0 END) AS total_mice
FROM orders
GROUP BY customer_id;
Grouping Data
GROUP BY in SQL is used to arrange identical data into groups so you can apply aggregate functions (like COUNT, SUM, AVG, MIN, MAX) on each group separately.
Think of it as saying: Take all the rows, group them based on this column (or set of columns), and then summarize each group.
General Syntax
SELECT column1, AGGREGATE_FUNCTION(column2)
FROM table_name
WHERE condition
GROUP BY column1;
- Every column in
SELECTthat is not inside an aggregate function must be listed inGROUP BY. GROUP BYcomes afterWHEREbut beforeHAVINGandORDER BY.
Example
| order_id | customer_id | product | quantity | price | order_date |
|---|---|---|---|---|---|
| 1 | C001 | Laptop | 1 | 1000 | 2025-08-01 |
| 2 | C002 | Mouse | 2 | 25 | 2025-08-01 |
| 3 | C001 | Keyboard | 1 | 45 | 2025-08-02 |
| 4 | C003 | Laptop | 1 | 950 | 2025-08-02 |
| 5 | C002 | Mouse | 3 | 25 | 2025-08-03 |
SELECT customer_id, COUNT(*) AS total_orders
FROM orders
GROUP BY customer_id;
Output:
| customer_id | total_orders |
|---|---|
| C001 | 2 |
| C002 | 2 |
| C003 | 1 |
GROUP BY customer_id→ All rows with the samecustomer_idare combined into one group.COUNT(*)counts how many orders are in each group
Key Points to Remember for GROUP BY
- Use
GROUP BYwhen you want one row per group instead of one row per record. - All non-aggregated columns in
SELECTmust be inGROUP BY. HAVINGis for group-level filtering;WHEREis for row-level filtering.- You can group by multiple columns or expressions.
Filtering Groups
HAVING is used to filter groups after aggregation has happened. It works with GROUP BY and aggregate functions like COUNT, SUM, AVG, MIN, MAX. WHERE filters rows before grouping, while HAVING filters aggregated results after grouping.
Syntax of HAVING
SELECT column1, AGGREGATE_FUNCTION(column2)
FROM table_name
WHERE row_condition
GROUP BY column1
HAVING aggregate_condition
ORDER BY column1;
Example of HAVING
Why not use WHERE?
Let’s say we want to find customers who ordered more than 2 items total.
This fails:
SELECT customer_id, SUM(quantity)
FROM orders
WHERE SUM(quantity) > 2 -- ❌ INVALID: WHERE cannot use aggregates
GROUP BY customer_id;
Because SUM(quantity) doesn’t exist until after grouping.
This works:
SELECT customer_id, SUM(quantity) AS total_items
FROM orders
GROUP BY customer_id
HAVING SUM(quantity) > 2;
Output:
| customer_id | total_items |
|---|---|
| C002 | 5 |
Explanation
- Group orders by customer_id.
- Calculate SUM(quantity) for each group.
- Keep only groups where SUM(quantity) > 2.
HAVING without GROUP BY
You can use HAVING without GROUP BY. In that case, it treats the entire table as one group.
Example: Find if total sales exceed 2000:
SELECT SUM(quantity * price) AS total_sales
FROM orders
HAVING SUM(quantity * price) > 2000;
It will show total_sales is 2120..
Multiple Conditions in HAVING
You can filter on more than one aggregate condition.
Example: Customers who ordered more than 2 items and placed at least 2 orders:
SELECT customer_id,
COUNT(order_id) AS order_count,
SUM(quantity) AS total_items
FROM orders
GROUP BY customer_id
HAVING SUM(quantity) > 2
AND COUNT(order_id) >= 2;
WHERE vs HAVING
| Feature | WHERE | HAVING |
|---|---|---|
| Filters rows | ✅ Yes | ❌ No |
| Filters groups | ❌ No | ✅ Yes |
| Can use aggregates | ❌ No | ✅ Yes |
| Executes | Before grouping | After grouping |
Join
A JOIN combines rows from two or more tables based on a related column (usually a foreign key relationship).
You tell SQL:
Match rows from these tables where a specific condition is true, and return the combined data.
Types of Joins
| Join Type | Returns… |
|---|---|
| INNER JOIN | Only rows where there’s a match in both tables |
| LEFT JOIN | All rows from the left table, plus matched rows from the right (NULL if no match) |
| RIGHT JOIN | All rows from the right table, plus matched rows from the left (NULL if no match) |
| FULL OUTER JOIN | All rows from both tables, matched where possible, NULL where no match |
Example Tables for Join
customers
| customer_id | name | country |
|---|---|---|
| 1 | Alice | USA |
| 2 | Bob | UK |
| 3 | Charlie | Canada |
| 4 | Diana | USA |
orders
| order_id | customer_id | product | amount |
|---|---|---|---|
| 101 | 1 | Laptop | 1200 |
| 102 | 2 | Mouse | 25 |
| 103 | 1 | Keyboard | 45 |
| 104 | 3 | Monitor | 200 |
| 105 | 5 | Desk Chair | 150 |
Use the smallest dataset first.
INNER JOIN
Returns only customers who have orders.
SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id;
| customer_id | name | product | amount |
|---|---|---|---|
| 1 | Alice | Laptop | 1200 |
| 1 | Alice | Keyboard | 45 |
| 2 | Bob | Mouse | 25 |
| 3 | Charlie | Monitor | 200 |
INNER JOINonly includes rows wherecustomer_idexists in bothcustomersandorders.- Diana is excluded (no order), and order 105 is excluded (customer_id=5 not in customers).
LEFT JOIN
Returns all customers, plus their orders if any.
SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
LEFT JOIN orders o
ON c.customer_id = o.customer_id;
| customer_id | name | product | amount |
|---|---|---|---|
| 1 | Alice | Laptop | 1200 |
| 1 | Alice | Keyboard | 45 |
| 2 | Bob | Mouse | 25 |
| 3 | Charlie | Monitor | 200 |
| 4 | Diana | NULL | NULL |
- Diana shows up with
NULLvalues forproductandamountbecause she has no orders. - This is useful when you want all left-side records regardless of matches.
RIGHT JOIN
Returns all orders, plus customer details if any.
SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
RIGHT JOIN orders o
ON c.customer_id = o.customer_id;
| customer_id | name | product | amount |
|---|---|---|---|
| 1 | Alice | Laptop | 1200 |
| 1 | Alice | Keyboard | 45 |
| 2 | Bob | Mouse | 25 |
| 3 | Charlie | Monitor | 200 |
| NULL | NULL | Desk Chair | 150 |
- Order 105 appears, but since
customer_id=5doesn’t exist incustomers, customer details are NULL. - This is essentially the mirror of
LEFT JOIN.
FULL OUTER JOIN
Returns all customers and all orders, matched where possible.
SELECT c.customer_id, c.name, o.product, o.amount
FROM customers c
FULL OUTER JOIN orders o
ON c.customer_id = o.customer_id;
| customer_id | name | product | amount |
|---|---|---|---|
| 1 | Alice | Laptop | 1200 |
| 1 | Alice | Keyboard | 45 |
| 2 | Bob | Mouse | 25 |
| 3 | Charlie | Monitor | 200 |
| 4 | Diana | NULL | NULL |
| NULL | NULL | Desk Chair | 150 |
- Combines the results of
LEFT JOINandRIGHT JOIN. - Shows unmatched customers (like Diana) and unmatched orders (like Desk Chair).
Always use explicit JOIN conditions, not implicit joins via the WHERE clause.
-- Avoid implicit join in WHERE clause:
SELECT orders.order_id, customers.customer_name
FROM orders, customers
WHERE orders.customer_id = customers.customer_id;
-- Use explicit JOIN instead:
SELECT orders.order_id, customers.customer_name
FROM orders
JOIN customers ON orders.customer_id = customers.customer_id;
Implicit join will return a Cartesian product, meaning every order will be combined with every customer, which could lead to a massive number of rows in the result set if the tables are large.
Visual Summary
INNER JOIN: LEFT JOIN: RIGHT JOIN: FULL OUTER JOIN:
A ∩ B A ⊃ (A ∩ B) (A ∩ B) ⊂ B A ∪ B
- INNER JOIN → Only the overlap
- LEFT JOIN → Everything from left table + matches
- RIGHT JOIN → Everything from right table + matches
- FULL OUTER JOIN → Everything from both tabl
Subquery (Nested Query)
A Subquery is a query inside another query.
- It is enclosed in parentheses
( ). - It can appear in the
SELECT,FROM,WHERE, orHAVINGclauses. - The main query (outer query) uses the result of the subquery (inner query).
When to Use Subqueries?
- To break complex problems into smaller, manageable queries.
- When joins are not straightforward.
- To check existence (EXISTS), compare aggregates, or filter with conditions.
Types of Subqueries
Single-row Subquery
- Returns only one row.
- Usually used with comparison operators:
=,<,>,<=,>=,<>.
Example:
Find students who have the same department as Alice.
SELECT student_name
FROM Students
WHERE department_id = (
SELECT department_id
FROM Students
WHERE student_name = 'Alice'
);
- Inner query: Finds Alice’s
department_id. - Outer query: Returns all students in that department.
Multi-row Subquery
- Returns multiple rows.
- Used with operators like
IN,ANY,ALL.
Example:
Find students enrolled in any course in the Science department
SELECT student_name
FROM Students
WHERE student_id IN (
SELECT student_id
FROM Enrollments
WHERE course_id IN (
SELECT course_id
FROM Courses
WHERE department_id = (
SELECT department_id
FROM Departments
WHERE department_name = 'Science'
)
)
);
- Inner-most query: Gets the
department_idof "Science". - Next query: Gets all
course_idbelonging to "Science". - Next query: Gets
student_idof students enrolled in those courses. - Outer query: Finds student names from
Students.
Correlated Subquery
- Inner query depends on the outer query.
- Runs once for each row of the outer query.
Example:
Find students who are enrolled in more than one course.
SELECT s.student_name
FROM Students s
WHERE (
SELECT COUNT(*)
FROM Enrollments e
WHERE e.student_id = s.student_id
) > 1;
- For each student in
Students, the subquery counts how many courses they are enrolled in. - Outer query selects only those with count > 1.
Subqueries in Different Clauses
-
In
SELECTclause (Scalar Subquery)SELECT student_name,
(SELECT department_name
FROM departments d
WHERE d.department_id = s.department_id) AS dept_name
FROM students s;Adds department name directly in the result without a join.
-
In
FROMclause (Derived Table / Inline View)SELECT dept_id, COUNT(*) AS total_students
FROM (
SELECT department_id AS dept_id
FROM students
) AS temp
GROUP BY dept_id;Subquery acts like a temporary table (
temp). -
In
HAVINGclauseSELECT course_id, COUNT(student_id) AS total_enrolled
FROM enrollments
GROUP BY course_id
HAVING COUNT(student_id) > (
SELECT AVG(total)
FROM (
SELECT COUNT(student_id) AS total
FROM enrollments
GROUP BY course_id
) AS sub
);Finds courses where enrollment is greater than the average enrollment across all courses.
Subquery vs Join
Subqueries (especially in the WHERE clause) can often be inefficient, especially if they are correlated subqueries (i.e., where the subquery references columns from the outer query). Whenever possible, replace subqueries with JOIN or EXISTS clauses.
- Subquery: More readable for step-by-step logic, but sometimes slower.
- Join: Usually faster for large datasets, but can be harder to read.
Example: Both return students enrolled in “Math”:
Using Subquery:
SELECT student_name
FROM Students
WHERE student_id IN (
SELECT student_id
FROM Enrollments e
JOIN Courses c ON e.course_id = c.course_id
WHERE c.course_name = 'Math'
);
Using Join:
SELECT DISTINCT s.student_name
FROM Students s
JOIN Enrollments e ON s.student_id = e.student_id
JOIN Courses c ON e.course_id = c.course_id
WHERE c.course_name = 'Math';
UNION in MySQL
The UNION operator is used to combine the results of two or more SELECT queries into a single result set.
Rules of UNION:
- Each
SELECTmust have the same number of columns. - The data types of corresponding columns must be compatible.
- By default,
UNIONremoves duplicate rows. - Use
UNION ALLif you want to keep duplicates.
Example of UNION
-- Students table
CREATE TABLE Students (
student_id INT,
name VARCHAR(100),
email VARCHAR(100)
);
-- Teachers table
CREATE TABLE Teachers (
teacher_id INT,
name VARCHAR(100),
email VARCHAR(100)
);
-- Query using UNION
SELECT name, email FROM Students
UNION
SELECT name, email FROM Teachers;
- Combines student and teacher contacts into one list.
- If a student and teacher share the same email, it will appear only once.
Avoid Using OR in WHERE Clauses
OR can cause MySQL to perform more work than necessary, especially if it causes full table scans. When possible, try to rewrite the query to avoid OR.
-- Avoid using OR in WHERE clause:
SELECT order_id, total_amount FROM orders WHERE total_amount < 100 OR total_amount > 500;
-- Rewrite the query with a UNION:
SELECT order_id, total_amount FROM orders WHERE total_amount < 100
UNION
SELECT order_id, total_amount FROM orders WHERE total_amount > 500;
Example with UNION ALL:
SELECT name, email FROM Students
UNION ALL
SELECT name, email FROM Teachers;
Same as above, but duplicates are kept.
INTERSECT in MySQL
Unlike some databases (like PostgreSQL, Oracle, SQL Server), MySQL does NOT support INTERSECT directly.
But we can simulate it using INNER JOIN or IN clause.
Example of INTERSECT
-- Using INNER JOIN
SELECT s.name, s.email
FROM students s
INNER JOIN teachers t ON s.email = t.email;
OR using subquery:
SELECT name, email FROM Students
WHERE email IN (SELECT email FROM Teachers);
- Finds records that exist in both tables (intersection).
- Works like
INTERSECT.
Comparing UNION vs INTERSECT
| Feature | UNION | INTERSECT (simulated in MySQL) |
|---|---|---|
| Combines rows | ✅ Yes | ❌ No (filters only common rows) |
| Removes duplicates | ✅ By default | ✅ Naturally (since only common rows) |
| Keeps duplicates | ✅ With UNION ALL | ❌ Not applicable |
| Availability in MySQL | ✅ Native support | ❌ Must simulate using JOIN or IN |
Example of UNION and INTERSECT
Suppose we have two mailing lists:
event_registrations(people who registered for an event).newsletter_subscribers(people who subscribed to the newsletter).
- People in either list (UNION):
SELECT email FROM event_registrations
UNION
SELECT email FROM newsletter_subscribers; - People in both lists (INTERSECT):
SELECT email FROM event_registrations
WHERE email IN (SELECT email FROM newsletter_subscribers);
UNIONgives everyone we can contact.INTERSECTgives our most engaged people (registered + subscribed).